netdev CI testing #6666

Open

wants to merge 79 commits into base: bpf-next_base
Conversation

kuba-moo
Contributor

Reusable PR for hooking netdev CI to BPF testing.

@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 4f22ee0 to 8a9a8e0 on March 28, 2024 04:46
@kuba-moo kuba-moo force-pushed the to-test branch 11 times, most recently from 64c403f to 8da1f58 on March 29, 2024 00:01
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 78ebb17 to 9325308 on March 29, 2024 02:14
@kuba-moo kuba-moo force-pushed the to-test branch 6 times, most recently from c8c7b2f to a71aae6 on March 29, 2024 18:01
@kuba-moo kuba-moo force-pushed the to-test branch 2 times, most recently from d8feb00 to b16a6b9 on March 30, 2024 00:01
@kuba-moo kuba-moo force-pushed the to-test branch 2 times, most recently from 4164329 to c5cecb3 on March 30, 2024 06:00
dark and others added 29 commits January 26, 2025 22:00
When a workqueue is created with `WQ_UNBOUND`, its work items are
served by special worker-pools, whose host workers are not bound to
any specific CPU. In the default configuration (i.e. when
`queue_delayed_work` and friends do not specify which CPU to run the
work item on), `WQ_UNBOUND` allows the work item to be executed on any
CPU in the same node as the CPU it was enqueued on. While this
solution potentially sacrifices locality, it avoids contention with
other processes that might dominate the CPU time of the processor the
work item was scheduled on.

This is not just a theoretical problem: in one particular scenario, a
misconfigured process was hogging most of CPU0's time, leaving less
than 0.5% of its CPU time to the kworker. The IDPF workqueues that
were using the kworker on CPU0 suffered large completion delays as a
result, causing performance degradation, timeouts, and an eventual
system crash.
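
A minimal sketch of the kind of change described, assuming the driver
workqueue is created in the probe path (the workqueue name and the
surrounding variables are illustrative):

  /* Allocate the workqueue as unbound so work items may run on any
   * CPU in the node instead of being pinned to the enqueueing CPU. */
  adapter->wq = alloc_workqueue("%s", WQ_UNBOUND | WQ_MEM_RECLAIM, 0,
                                dev_name(&pdev->dev));
  if (!adapter->wq)
          return -ENOMEM;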

Tested:

* I have also run a manual test to gauge the performance
  improvement. The test consists of an antagonist process
  (`./stress --cpu 2`) consuming as much of CPU 0 as possible. This
  process is run under `taskset 01` to bind it to CPU0, and its
  priority is changed with `chrt -pQ 9900 10000 ${pid}` and
  `renice -n -20 ${pid}` after start.

  Then, the IDPF driver is forced to prefer CPU0 by editing all calls
  to `queue_delayed_work`, `mod_delayed_work`, etc... to use CPU 0.

  Finally, `ktraces` for the workqueue events are collected.

  Without the current patch, the antagonist process can force
  arbitrary delays between `workqueue_queue_work` and
  `workqueue_execute_start`, which in my tests were as high as
  `30ms`. With the current patch applied, the workqueue can be
  migrated to another unloaded CPU in the same node and, keeping
  everything else equal, the maximum delay I could see was `6us`.

Fixes: 0fe4546 ("idpf: add create vport and netdev configuration")
Signed-off-by: Marco Leogrande <[email protected]>
Signed-off-by: Manoj Vishwanathan <[email protected]>
Signed-off-by: Brian Vazquez <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Reviewed-by: Pavan Kumar Linga <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Add more information related to the transaction, like cookie, vc_op,
and salt, when a transaction times out, and include similar information
when the transaction salt does not match.

Info output for transaction timeout:
-------------------
(op:5015 cookie:45fe vc_op:5015 salt:45 timeout:60000ms)
-------------------

before it was:

-------------------
(op 5015, 60000ms)
-------------------
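
A hypothetical sketch of the extended message; the variable and field
names here are illustrative, not the driver's actual identifiers:

  /* log all transaction details on timeout (names illustrative) */
  dev_err(&adapter->pdev->dev,
          "transaction timed out (op:%d cookie:%04x vc_op:%d salt:%02x timeout:%dms)\n",
          xn->op, xn->cookie, xn->vc_op, xn->salt, timeout_ms);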

Signed-off-by: Manoj Vishwanathan <[email protected]>
Signed-off-by: Brian Vazquez <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Reviewed-by: Pavan Kumar Linga <[email protected]>
Reviewed-by: Paul Menzel <[email protected]>
Tested-by: Krishneil Singh <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Fix &ice_parser_rt::bst_key size. It was wrongly set to 10 instead of 20
in the initial implementation commit (see Fixes tag). All usage code
assumed it was of size 20. That was also the initial size present up to
v2 of the intro series [2], but it was halved by the v3 [3] refactor
described as "Replace magic hardcoded values with macros." The
introducing series was so big that some ugliness went unnoticed, and the
same holds for bugs.

ICE_BST_KEY_TCAM_SIZE and ICE_BST_TCAM_KEY_SIZE differed by one.
There was a tmp variable @j in the scope of the edited function, but it
was not used in all places. This ugliness is now gone.
I'm moving ice_parser_rt::pg_prio a few positions up, to fill up one of
the holes, in order to compensate for the 10 bytes added to ::bst_key.
This results in the same overall struct size as prior to the fix, with
minimal changes in the offsets of the fields.

Also extend the debug dump print of the key to cover all bytes. To avoid
a format string with 20 "%02x" specifiers and 20 parameters, switch to
ice_debug_array_w_prefix().

This fix obsoletes Ahmed's attempt at [1].

[1] https://lore.kernel.org/intel-wired-lan/[email protected]
[2] https://lore.kernel.org/intel-wired-lan/[email protected]
[3] https://lore.kernel.org/intel-wired-lan/[email protected]
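
An illustrative sketch of the resulting layout; only the two fields
named above are shown, and the surrounding members are elided:

  #define ICE_BST_TCAM_KEY_SIZE 20        /* was wrongly 10 */

  struct ice_parser_rt {
          /* ... */
          u8 pg_prio;     /* moved up to fill a hole */
          u8 bst_key[ICE_BST_TCAM_KEY_SIZE];
          /* ... */
  };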

Reported-by: Dan Carpenter <[email protected]>
Closes: https://lore.kernel.org/intel-wired-lan/[email protected]
Fixes: 9a4c07a ("ice: add parser execution main loop")
CC: Ahmed Zaki <[email protected]>
Reviewed-by: Larysa Zaremba <[email protected]>
Signed-off-by: Przemek Kitszel <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Commit 7083893 ("ice: Implement driver functionality to dump serdes
equalizer values") added the invalid DRATE parameter to the set of
values read. The output of the command:

  $ ethtool -d <ethX>

returns a garbage value in the place where the DRATE value should be
stored.

Remove the mentioned parameter to prevent returning corrupted data to
userspace.

Fixes: 7083893 ("ice: Implement driver functionality to dump serdes equalizer values")
Signed-off-by: Mateusz Polchlopek <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Tested-by: Rinitha S <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
First case:
> ip l a l $VF name vlanx type vlan id 100
> ip l d vlanx
> ip l a l $VF name vlanx type vlan id 100

As the workqueue can execute after some time, there is a window for a
call trace like this:
- iavf_del_vlan
- iavf_add_vlan
- iavf_del_vlans (wq)

It means that our VLAN 100 will change state from IAVF_VLAN_ACTIVE
to IAVF_VLAN_REMOVE (iavf_del_vlan). After that, iavf_add_vlan won't
change the state because VLAN 100 is already on the filter list. The
final result is that the VLAN 100 filter isn't added in hardware (no
iavf_add_vlans call).

To fix that, change the state directly to active if the filter wasn't
removed yet. This is safe, as IAVF_VLAN_REMOVE means the virtchnl
message hasn't been sent yet.

Second case:
> ip l a l $VF name vlanx type vlan id 100
Any type of VF reset, e.g. a trust change:
> ip l s $PF vf $VF_NUM trust on
> ip l d vlanx
> ip l a l $VF name vlanx type vlan id 100

In case of a reset, the iavf driver is responsible for re-adding all
filters that are in use. To do that, the state of all VLAN filters is
changed to IAVF_VLAN_ADD. Here the window for changing the VLAN state
from the kernel side is even longer, as the workqueue isn't called
immediately. We can have a call trace like this:

- changing to IAVF_VLAN_ADD (after reset)
- iavf_del_vlan (called from kernel ops)
- iavf_del_vlans (wq)

Non-existing VLAN filters will be removed from hardware. That isn't a
bug; the ice driver will handle it fine. However, we can also have a
call trace like this:

- changing to IAVF_VLAN_ADD (after reset)
- iavf_del_vlan (called from kernel ops)
- iavf_add_vlan (called from kernel ops)
- iavf_del_vlans (wq)

With the fix for the previous case, we end up with no VLAN filters in
hardware. We have to remove a VLAN filter if its state is IAVF_VLAN_ADD
and a VLAN delete was requested. This is safe, as IAVF_VLAN_ADD means
the virtchnl message hasn't been sent yet.
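
A minimal sketch of the two rules described, with simplified control
flow; the filter lookup and locking are elided:

  /* in iavf_add_vlan(), when the filter is already on the list: */
  if (f->state == IAVF_VLAN_REMOVE)
          f->state = IAVF_VLAN_ACTIVE;    /* removal not sent to PF yet */

  /* in iavf_del_vlan(): */
  if (f->state == IAVF_VLAN_ADD) {
          /* add not sent to PF yet: just forget the filter */
          list_del(&f->list);
          kfree(f);
  } else {
          f->state = IAVF_VLAN_REMOVE;
  }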

Fixes: 0c0da0e ("iavf: refactor VLAN filter states")
Signed-off-by: Michal Swiatkowski <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Pointer arguments passed to ioctls need to pass through compat_ptr() to
work correctly on s390; as explained in Documentation/driver-api/ioctl.rst.
Detect compat mode at runtime and call compat_ptr() for those commands
which do take pointer arguments.
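
A minimal sketch of the pattern, following
Documentation/driver-api/ioctl.rst; which commands take pointer
arguments is driver-specific (PTP_ENABLE_PPS, for example, passes a
plain integer):

  void __user *argptr = (void __user *)arg;

  /* s390 compat tasks encode user pointers specially, so pointer
   * arguments must be converted with compat_ptr() */
  if (in_compat_syscall() && cmd != PTP_ENABLE_PPS)
          argptr = compat_ptr(arg);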

Suggested-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Fixes: d94ba80 ("ptp: Added a brand new class driver for ptp clocks.")
Signed-off-by: Thomas Weißschuh <[email protected]>
Reviewed-by: Cyrill Gorcunov <[email protected]>
Reviewed-by: Arnd Bergmann <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Originally, it was possible for the DPE length check to overflow if
wDatagramIndex + wDatagramLength > U16_MAX. This could lead to an OoB
read.

Move the wDatagramIndex term to the other side of the inequality.

An existing condition ensures that wDatagramIndex < urb->actual_length.
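
A hypothetical sketch of the rearranged check; the local variable names
are illustrative:

  u16 dg_idx = le16_to_cpu(dpe->wDatagramIndex);
  u16 dg_len = le16_to_cpu(dpe->wDatagramLength);

  /* before: dg_idx + dg_len > urb->actual_length  (sum may exceed U16_MAX)
   * after:  move dg_idx across the inequality; dg_idx < urb->actual_length
   *         is already guaranteed by an existing condition */
  if (dg_len > urb->actual_length - dg_idx)
          goto rx_error;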

Fixes: a2d274c ("usbnet: ipheth: add CDC NCM support")
Cc: [email protected]
Signed-off-by: Foster Snowhill <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
By definition, a DPE points at the start of a network frame/datagram.
Thus it makes no sense for it to point at anything that's part of the
NCM header. It is not a security issue, but merely an indication of
a malformed DPE.

Enforce that all DPEs point at the data portion of the URB, past the
NCM header.

Fixes: a2d274c ("usbnet: ipheth: add CDC NCM support")
Cc: [email protected]
Signed-off-by: Foster Snowhill <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Original code allowed for the start of NDP16 to be anywhere within the
URB based on the `wNdpIndex` value in NTH16. Only the start position of
NDP16 was checked, so it was possible for even the fixed-length part
of NDP16 to extend past the end of URB, leading to an out-of-bounds
read.

On iOS devices, the NDP16 header always directly follows NTH16. Rely on
and check for this specific format.

This, along with NCM-specific minimal URB length check that already
exists, will ensure that the fixed-length part of NDP16 plus a set
amount of DPEs fit within the URB.

Note that this commit alone does not fully address the OoB read.
The limit on the amount of DPEs needs to be enforced separately.
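
A hypothetical sketch of the check described; the buffer handling is
simplified:

  struct usb_cdc_ncm_nth16 *nth16 = urb->transfer_buffer;

  /* on iOS devices NDP16 directly follows NTH16, so require exactly
   * that layout instead of trusting an arbitrary wNdpIndex */
  if (le16_to_cpu(nth16->wNdpIndex) != sizeof(struct usb_cdc_ncm_nth16))
          goto rx_error;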

Fixes: a2d274c ("usbnet: ipheth: add CDC NCM support")
Cc: [email protected]
Signed-off-by: Foster Snowhill <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Introduce an rx_error label to reduce repetitions in the header
signature checks.

Store wDatagramIndex and wDatagramLength after endianness conversion to
avoid repeated le16_to_cpu() calls.

Rewrite the loop to return on a null trailing DPE, which is required
by the CDC NCM spec. In case it is missing, fall through to rx_error.

This change does not fix any particular issue. Its purpose is to
simplify a subsequent commit that fixes a potential OoB read by limiting
the maximum amount of processed DPEs.
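
A simplified sketch of the reworked loop shape; the per-datagram
handling is elided and the DPE bound comes from the subsequent commit:

  for (i = 0; i < max_dpe; i++, dpe++) {
          dg_idx = le16_to_cpu(dpe->wDatagramIndex);
          dg_len = le16_to_cpu(dpe->wDatagramLength);

          /* a null trailing DPE terminates the NDP per the CDC NCM
           * spec: the normal, non-error exit of the loop */
          if (dg_idx == 0 || dg_len == 0)
                  return 0;       /* assuming an int-returning rx helper */

          /* ... validate and submit the datagram ... */
  }

  /* no null trailing DPE found: treat the URB as malformed */
  rx_error:
          dev->net->stats.rx_errors++;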

Cc: [email protected] # 6.5.x
Signed-off-by: Foster Snowhill <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Originally, the total NCM header size was computed as the sum of two
vaguely labelled constants. While accurate, it wasn't particularly clear
where they were coming from.

Use sizes of existing NCM structs where available. Define the total
NDP16 size based on the maximum amount of DPEs that can fit into the
iOS-specific fixed-size header.

This change does not fix any particular issue. Rather, it introduces
intermediate constants that will simplify subsequent commits.
It should also make it clearer for the reader where the constant values
come from.
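
An illustrative sketch of such derived constants; the DPE count below
is a made-up placeholder, not the driver's actual value:

  #define IPHETH_NDP16_MAX_DPE    22      /* placeholder value */
  #define IPHETH_NDP16_HEADER_SIZE (sizeof(struct usb_cdc_ncm_ndp16) + \
                                    IPHETH_NDP16_MAX_DPE * \
                                    sizeof(struct usb_cdc_ncm_dpe16))
  #define IPHETH_NCM_HEADER_SIZE  (sizeof(struct usb_cdc_ncm_nth16) + \
                                   IPHETH_NDP16_HEADER_SIZE)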

Cc: [email protected] # 6.5.x
Signed-off-by: Foster Snowhill <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Fix an out-of-bounds DPE read: limit the number of processed DPEs to
the amount that fits into the fixed-size NDP16 header.

Fixes: a2d274c ("usbnet: ipheth: add CDC NCM support")
Cc: [email protected]
Signed-off-by: Foster Snowhill <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Clarify that the "NCM" implementation in `ipheth` is very limited, as
iOS devices aren't compatible with the CDC NCM specification in regular
tethering mode.

For a standards-compliant implementation, one should turn to
the `cdc_ncm` module.

Cc: [email protected] # 6.5.x
Signed-off-by: Foster Snowhill <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Expected behaviour:
In case we reach the scheduler's limit, pfifo_tail_enqueue() will drop
a packet from the scheduler's queue and decrease the scheduler's qlen
by one. Then, pfifo_tail_enqueue() enqueues the new packet and
increases the scheduler's qlen by one. Finally, pfifo_tail_enqueue()
returns the `NET_XMIT_CN` status code.

Weird behaviour:
In case we set `sch->limit == 0` and trigger pfifo_tail_enqueue() on a
scheduler that has no packets, the 'drop a packet' step will do
nothing. This means the scheduler's qlen still equals 0. Then, we
continue to enqueue the new packet and increase the scheduler's qlen by
one. In summary, we can leverage pfifo_tail_enqueue() to increase qlen
by one and still return the `NET_XMIT_CN` status code.

The problem is:
Let's say we have two qdiscs: Qdisc_A and Qdisc_B.
 - Qdisc_A's type must have a '->graft()' function to create a parent/child relationship.
   Let's say Qdisc_A's type is `hfsc`. Enqueueing a packet to this qdisc triggers `hfsc_enqueue`.
 - Qdisc_B's type is pfifo_head_drop. Enqueueing a packet to this qdisc triggers `pfifo_tail_enqueue`.
 - Qdisc_B is configured to have `sch->limit == 0`.
 - Qdisc_A is configured to route enqueued packets to Qdisc_B.

Enqueueing a packet through Qdisc_A leads to:
 - hfsc_enqueue(Qdisc_A) -> pfifo_tail_enqueue(Qdisc_B)
 - Qdisc_B->q.qlen += 1
 - pfifo_tail_enqueue() returns `NET_XMIT_CN`
 - hfsc_enqueue() checks for `NET_XMIT_SUCCESS`, sees `NET_XMIT_CN`, and therefore does not increase Qdisc_A's qlen.

The whole process leads to a situation where Qdisc_A->q.qlen == 0 and Qdisc_B->q.qlen == 1.
Replacing 'hfsc' with another type (for example 'drr') still leads to the same problem.
This violates the design rule that a parent's qlen should equal the sum of its children's qlens.

Bug impact: This issue can be used for user->kernel privilege escalation when it is reachable.
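
For reference, a simplified sketch of the enqueue path discussed above,
modeled on net/sched/sch_fifo.c:

  static int pfifo_tail_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                                struct sk_buff **to_free)
  {
          unsigned int prev_backlog;

          if (likely(sch->q.qlen < sch->limit))
                  return qdisc_enqueue_tail(skb, sch);

          /* with sch->limit == 0 and an empty queue, this drop is a
           * no-op, yet the new skb is still enqueued below: qlen grows
           * to 1 while the caller sees NET_XMIT_CN */
          prev_backlog = sch->qstats.backlog;
          __qdisc_queue_drop_head(sch, &sch->q, to_free);
          qdisc_enqueue_tail(skb, sch);
          qdisc_tree_reduce_backlog(sch, 0, prev_backlog - sch->qstats.backlog);

          return NET_XMIT_CN;
  }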

Reported-by: Quang Le <[email protected]>
Signed-off-by: Quang Le <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
…limit==0

When limit == 0, pfifo_tail_enqueue() must drop the new packet and
increase the qdisc's dropped-packet count.
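
A minimal sketch of the fix described, as an early test in
pfifo_tail_enqueue():

  /* queue limit of zero: there is nothing to make room for, so the
   * new packet is dropped and counted against the qdisc */
  if (unlikely(sch->limit == 0))
          return qdisc_drop(skb, sch, to_free);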

All test results:

1..16
ok 1 a519 - Add bfifo qdisc with system default parameters on egress
ok 2 585c - Add pfifo qdisc with system default parameters on egress
ok 3 a86e - Add bfifo qdisc with system default parameters on egress with handle of maximum value
ok 4 9ac8 - Add bfifo qdisc on egress with queue size of 3000 bytes
ok 5 f4e6 - Add pfifo qdisc on egress with queue size of 3000 packets
ok 6 b1b1 - Add bfifo qdisc with system default parameters on egress with invalid handle exceeding maximum value
ok 7 8d5e - Add bfifo qdisc on egress with unsupported argument
ok 8 7787 - Add pfifo qdisc on egress with unsupported argument
ok 9 c4b6 - Replace bfifo qdisc on egress with new queue size
ok 10 3df6 - Replace pfifo qdisc on egress with new queue size
ok 11 7a67 - Add bfifo qdisc on egress with queue size in invalid format
ok 12 1298 - Add duplicate bfifo qdisc on egress
ok 13 45a0 - Delete nonexistent bfifo qdisc
ok 14 972b - Add prio qdisc on egress with invalid format for handles
ok 15 4d39 - Delete bfifo qdisc twice
ok 16 d774 - Check pfifo_head_drop qdisc enqueue behaviour when limit == 0

Signed-off-by: Quang Le <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
qdisc_tree_reduce_backlog() notifies the parent qdisc only if the child
qdisc becomes empty; therefore we need to reduce the backlog of the
child qdisc before calling it. Otherwise the call becomes a no-op and
results in a UAF in the DRR case (a test for which is integrated in the
following patch).
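
A hypothetical sketch of the ordering described; the qdisc pointer and
skb accounting are illustrative:

  /* decrement the child's qlen/backlog first ... */
  qdisc_qstats_backlog_dec(q->qdisc, skb);
  q->qdisc->q.qlen--;
  /* ... so the "child is now empty" test inside
   * qdisc_tree_reduce_backlog() can fire and notify the parents */
  qdisc_tree_reduce_backlog(q->qdisc, 1, qdisc_pkt_len(skb));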

Fixes: f8d4bc4 ("net/sched: netem: account for backlog updates from child qdisc")
Cc: Martin Ottens <[email protected]>
Reported-by: Mingi Cho <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Integrate the test case provided by Mingi Cho into TDC.

All test results:

1..4
ok 1 ca5e - Check class delete notification for ffff:
ok 2 e4b7 - Check class delete notification for root ffff:
ok 3 33a9 - Check ingress is not searchable on backlog update
ok 4 a4b9 - Test class qlen notification

Cc: Mingi Cho <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Max Schulze <[email protected]>
Tested-by: Max Schulze <[email protected]>
Suggested-by: David Hollis <[email protected]>
Reported-by: Sven Kreiensen <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Replace vmalloc() allocations with kvmalloc(), which tries a kmalloc()
first and only falls back to vmalloc() when necessary, making the
allocation more flexible.

Signed-off-by: Denis Kirjanov <[email protected]>
Reviewed-by: Florian Westphal <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
The sanity check that both source and destination are set when symmetric
RSS hash is requested is only relevant for ETHTOOL_SRXFH (rx-flow-hash),
it should not be performed on any other commands (e.g.
ETHTOOL_SRXCLSRLINS/ETHTOOL_SRXCLSRLDEL).

This resolves access to the uninitialized 'info.data' field and fixes
false errors in rule insertion:
  # ethtool --config-ntuple eth2 flow-type ip4 dst-ip 255.255.255.255 action -1 loc 0
  rmgr: Cannot insert RX class rule: Invalid argument
  Cannot insert classification rule
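
A hypothetical sketch of the narrowed check; the helper shown here is
illustrative, not the actual function in net/ethtool/ioctl.c:

  /* only ETHTOOL_SRXFH carries hash-field data worth validating */
  if (info.cmd == ETHTOOL_SRXFH &&
      !rxfh_sym_fields_valid(info.data))      /* hypothetical helper */
          return -EINVAL;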

Fixes: 13e5934 ("net: ethtool: add support for symmetric-xor RSS hash")
Cc: Ahmed Zaki <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
The number of MTL queues to use is specified by the parameter
"snps,{tx,rx}-queues-to-use" from the stmmac_platform layer.

However, the maximum number of queues is constrained by upper limits
determined by the capabilities of the hardware. Limit the values so
they do not exceed those upper limits, and display a warning message
when clamping occurs.

This only works if the hardware capability fields provide the upper
limit values.
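
A hypothetical sketch of the clamping for the Tx side; the Rx side
would be symmetric (field names follow the stmmac driver, the message
text is illustrative):

  if (priv->dma_cap.number_tx_queues &&
      priv->plat->tx_queues_to_use > priv->dma_cap.number_tx_queues) {
          dev_warn(priv->device,
                   "Tx queues %u exceed hw capability, limiting to %u\n",
                   priv->plat->tx_queues_to_use,
                   priv->dma_cap.number_tx_queues);
          priv->plat->tx_queues_to_use = priv->dma_cap.number_tx_queues;
  }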

Fixes: d976a52 ("net: stmmac: multiple queues dt configuration")
Signed-off-by: Kunihiko Hayashi <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Tx/Rx FIFO size is specified by the parameter "{tx,rx}-fifo-depth" from
stmmac_platform layer.

However, these values are constrained by upper limits determined by the
capabilities of the hardware. There is a risk that the upper bits will
be truncated by the calculation, so limit the values to the upper
limits and display a warning message when clamping occurs.

This only works if the hardware capability fields provide the upper
limit values.

Fixes: e7877f5 ("stmmac: Read tx-fifo-depth and rx-fifo-depth from the devicetree")
Signed-off-by: Kunihiko Hayashi <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
…pecified

When the Tx/Rx FIFO size is not specified in advance, the driver checks
whether the value is zero and substitutes the hardware capability value
in each function where the value is used.

Consolidate the check and assignment into stmmac_hw_init() and remove
the other, now-redundant statements.

If the FIFO size is zero and the hardware capability also doesn't
provide an upper limit value, return with an error message.
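
A hypothetical sketch of the consolidated check for the Tx side in
stmmac_hw_init(); the Rx side would be symmetric:

  if (!priv->plat->tx_fifo_size) {
          if (!priv->dma_cap.tx_fifo_size) {
                  dev_err(priv->device, "Can't determine Tx FIFO size\n");
                  return -ENODEV;
          }
          priv->plat->tx_fifo_size = priv->dma_cap.tx_fifo_size;
  }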

Signed-off-by: Kunihiko Hayashi <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
tc_actions.sh keeps hanging the forwarding tests.

sdf@: tdc & tdc-dbg started intermittently failing around Sep 25th

Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: NipaLocal <nipa@local>
This reverts commit 76d5d4c.

Signed-off-by: NipaLocal <nipa@local>